
    On multi-class learning through the minimization of the confusion matrix norm

    In imbalanced multi-class classification problems, the misclassification rate may not be a relevant error measure. Several methods have been developed in which the performance measure retains richer information than the mere misclassification rate: misclassification costs, ROC-based information, etc. Following this idea of working with alternative measures of performance, we propose to address imbalanced classification problems by optimizing a new measure: the norm of the confusion matrix. Indeed, recent results show that using the norm of the confusion matrix as an error measure is appealing because of the fine-grained information the matrix contains, especially in the case of imbalanced classes. Our first contribution consists in showing that optimizing a criterion based on the confusion matrix provides a common background for cost-sensitive methods aimed at imbalanced-class learning problems. As our second contribution, we propose an extension of a recent multi-class boosting method --- namely AdaBoost.MM --- to the imbalanced-class setting, by greedily minimizing the empirical norm of the confusion matrix. A theoretical analysis of the properties of the proposed method is presented, and experimental results illustrate the behavior of the algorithm and show the relevance of the approach compared to other methods.
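
    To make the error measure concrete, here is a minimal sketch of one plausible instantiation: the operator norm of the row-normalized, off-diagonal confusion matrix. The normalization and the choice of the spectral norm are illustrative assumptions, not necessarily the exact quantity the paper optimizes.

        # Hedged sketch: a confusion-matrix-norm error measure (illustrative
        # normalization and norm, not the paper's exact objective).
        import numpy as np
        from sklearn.metrics import confusion_matrix

        def confusion_norm(y_true, y_pred):
            C = confusion_matrix(y_true, y_pred).astype(float)
            C /= C.sum(axis=1, keepdims=True)   # per-class error rates
            np.fill_diagonal(C, 0.0)            # keep only misclassifications
            return np.linalg.norm(C, ord=2)     # largest singular value

        # A majority-class predictor looks good on accuracy (0.9 here) but
        # maximally bad on the confusion norm, which weighs classes equally.
        y_true = np.array([0] * 90 + [1] * 10)
        y_pred = np.zeros(100, dtype=int)
        print(confusion_norm(y_true, y_pred))   # 1.0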

    The Multi-Task Learning View of Multimodal Data

    We study the problem of learning from multiple views using kernel methods in a supervised setting. We approach this problem from a multi-task learning point of view and illustrate how to capture the interesting multimodal structure of the data using multi-task kernels. Our analysis shows that the multi-task perspective offers the flexibility to design more efficient multiple-source learning algorithms, and hence the ability to exploit multiple descriptions of the data. In particular, we formulate the multimodal learning framework using vector-valued reproducing kernel Hilbert spaces, and we derive specific multi-task kernels that can operate over multiple modalities. Finally, we analyze the vector-valued regularized least squares algorithm in this context and demonstrate its potential in a series of experiments with a real-world multimodal data set.
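
    As a rough illustration of the setting, the sketch below implements vector-valued regularized least squares with a separable multi-task kernel K(x, x') = k(x, x') * A, where a T x T matrix A couples the outputs (e.g., one output per modality). The RBF kernel, A, and lam are assumptions for illustration, not the paper's derived kernels.

        # Hedged sketch of vector-valued RLS with a separable multi-task kernel.
        import numpy as np

        def rbf(X, Z, gamma=1.0):
            d = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
            return np.exp(-gamma * d)

        def vv_rls_fit(X, Y, A, lam=1e-2, gamma=1.0):
            # Solve (k(X,X) kron A + n*lam*I) vec(C) = vec(Y); the fitted
            # function is f(x) = sum_i k(x, x_i) * A @ C[i].
            n, T = Y.shape
            G = np.kron(rbf(X, X, gamma), A) + n * lam * np.eye(n * T)
            return np.linalg.solve(G, Y.reshape(-1)).reshape(n, T)

        def vv_rls_predict(Xte, X, C, A, gamma=1.0):
            return rbf(Xte, X, gamma) @ C @ A.T     # (m, T) predictions

        # Toy usage: two coupled outputs sharing one underlying signal.
        rng = np.random.default_rng(0)
        X = rng.normal(size=(40, 3))
        Y = np.column_stack([X[:, 0], X[:, 0] + 0.1 * rng.normal(size=40)])
        A = np.array([[1.0, 0.5], [0.5, 1.0]])      # output-coupling matrix
        C = vv_rls_fit(X, Y, A)
        print(vv_rls_predict(X[:5], X, C, A))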

    Learning from Imbalanced Datasets with Cross-View Cooperation-Based Ensemble Methods

    Greedy Methods, Randomization Approaches, and Multiarm Bandit Algorithms for Efficient Sparsity-Constrained Optimization

    Several sparsity-constrained algorithms, such as Orthogonal Matching Pursuit or the Frank-Wolfe algorithm with sparsity constraints, work by iteratively selecting a new atom to add to the current set of non-zero variables. This selection step is usually performed by computing the gradient and then looking for the gradient component with the maximal absolute entry. This step can be computationally expensive, especially for large-scale and high-dimensional data. In this work, we aim at accelerating these sparsity-constrained optimization algorithms by exploiting the key observation that, for these algorithms to work, one only needs the coordinate of the gradient's top entry. Hence, we introduce algorithms based on greedy methods and randomization approaches that cheaply estimate the gradient and its top entry. Another of our contributions is to cast the problem of finding the best gradient entry as best-arm identification in a multi-armed bandit problem. Owing to this novel insight, we are able to provide a bandit-based algorithm that directly estimates the top entry in a very efficient way. We also give theoretical results stating that the resulting inexact Frank-Wolfe or Orthogonal Matching Pursuit algorithms act, with high probability, similarly to their exact versions. We have carried out several experiments showing that the greedy deterministic and bandit approaches we propose can achieve an acceleration of an order of magnitude while being as effective as the exact gradient when used in algorithms such as OMP, Frank-Wolfe, or CoSaMP.
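
    A simplified stand-in for the bandit idea (not the paper's algorithm): treat each coordinate as an arm and run successive halving, where a pull samples one row i and uses X[i, j] * r[i]; its mean over rows has the same argmax as the exact gradient magnitude |X[:, j] @ r|. The budget and elimination schedule below are illustrative.

        # Hedged sketch: bandit-style selection of the top gradient entry.
        import numpy as np

        def bandit_top_entry(X, r, budget_per_round=32, rng=None):
            rng = np.random.default_rng(rng)
            n, d = X.shape
            arms = np.arange(d)
            sums, pulls = np.zeros(d), np.zeros(d)
            while len(arms) > 1:
                rows = rng.integers(0, n, size=budget_per_round)
                sums[arms] += (X[np.ix_(rows, arms)] * r[rows, None]).sum(axis=0)
                pulls[arms] += budget_per_round
                est = np.abs(sums[arms] / pulls[arms])
                arms = arms[np.argsort(est)[len(arms) // 2:]]  # keep top half
            return arms[0]

        # Toy check against the exact selection rule used by OMP.
        rng = np.random.default_rng(0)
        X = rng.normal(size=(2000, 64))
        w = np.zeros(64); w[7] = 3.0
        r = X @ w + 0.1 * rng.normal(size=2000)    # residual at w = 0
        print(bandit_top_entry(X, r, rng=1), np.argmax(np.abs(X.T @ r)))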

    PAC-Bayesian Generalization Bound on Confusion Matrix for Multi-Class Classification

    In this paper, we propose a PAC-Bayes bound for the generalization risk of the Gibbs classifier in the multi-class classification framework. The novelty of our work is the critical use of the confusion matrix of a classifier as an error measure; this puts our contribution in the line of work aiming at dealing with performance measures that are richer than a mere scalar criterion such as the misclassification rate. Thanks to very recent and beautiful results on matrix concentration inequalities, we derive two bounds showing that the true confusion risk of the Gibbs classifier is upper-bounded by its empirical risk plus a term depending on the number of training examples in each class. To the best of our knowledge, these are the first PAC-Bayes bounds based on confusion matrices.
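
    Purely for intuition, here is a hedged sketch of the empirical quantity such a bound controls: the operator norm of the Gibbs classifier's averaged confusion matrix, estimated here with bootstrap-trained trees standing in for posterior draws. The surrogate posterior, normalization, and data are all illustrative assumptions.

        # Hedged, illustrative sketch (not the paper's bound).
        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.metrics import confusion_matrix
        from sklearn.tree import DecisionTreeClassifier

        X, y = make_classification(n_samples=600, n_classes=3,
                                   n_informative=6, weights=[0.6, 0.3, 0.1],
                                   random_state=0)
        rng = np.random.default_rng(0)
        mats = []
        for _ in range(50):                      # draws from the "posterior"
            idx = rng.integers(0, len(y), len(y))
            h = DecisionTreeClassifier(max_depth=3).fit(X[idx], y[idx])
            C = confusion_matrix(y, h.predict(X)).astype(float)
            C /= C.sum(axis=1, keepdims=True)    # per-class conditional rates
            np.fill_diagonal(C, 0.0)             # keep only confusion mass
            mats.append(C)
        # Empirical confusion risk of the Gibbs classifier: norm of the mean.
        print(np.linalg.norm(np.mean(mats, axis=0), ord=2))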

    Integrating and reporting full multi-view supervised learning experiments using SuMMIT

    SuMMIT (Supervised Multi Modal Integration Tool) is a software package offering many functionalities for running, tuning, and analyzing supervised classification experiments specifically designed for multi-view data sets. SuMMIT is part of a platform that aggregates multiple tools for dealing with multi-view datasets, such as scikit-multimodallearn (Benielli et al., 2021) or MAGE (Bauvin et al., 2021). This paper presents use cases of SuMMIT, including hyper-parameter optimization, demonstrating the usefulness of such a platform for dealing with the complexity of multi-view benchmarking on an imbalanced dataset. SuMMIT is powered by Python 3 and based on scikit-learn, making it easy to use and extend by plugging in one's own algorithms and score functions or adding new features. By using continuous integration, we encourage collaborative development.
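
    This is not SuMMIT's actual API; the plain scikit-learn sketch below only illustrates the kind of benchmark the platform automates: per-view baselines against a naive early-fusion model, scored with an imbalance-aware metric. The synthetic views and model choices are assumptions.

        # Hedged sketch of a multi-view benchmark loop (not SuMMIT code).
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.metrics import balanced_accuracy_score
        from sklearn.model_selection import cross_val_predict

        rng = np.random.default_rng(0)
        y = (rng.random(300) < 0.15).astype(int)          # imbalanced labels
        views = {"view1": rng.normal(size=(300, 20)) + y[:, None],
                 "view2": rng.normal(size=(300, 5))}      # uninformative view

        results = {}
        for name, Xv in views.items():                    # monoview baselines
            pred = cross_val_predict(RandomForestClassifier(random_state=0),
                                     Xv, y, cv=5)
            results[name] = balanced_accuracy_score(y, pred)
        X_fused = np.hstack(list(views.values()))         # early fusion
        pred = cross_val_predict(RandomForestClassifier(random_state=0),
                                 X_fused, y, cv=5)
        results["early_fusion"] = balanced_accuracy_score(y, pred)
        print(results)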

    Multi-view Artificial Generation Engine: MAGE -- A Controlled Data Generator for Multi-view Learning

    Multi-view learning has been a thriving research field for several years, and many approaches have been proposed for a variety of learning problems. However, to the best of our knowledge, most works propose their own definition of supervised multi-view learning and their own experimental framework. In order to give a more formal setting for multi-view learning with more than two views, we propose a toolbox for generating native multi-view datasets. Our contributions are twofold: first, we propose three definitions of view interactions; then, we introduce MAGE, a Python-based toolbox for dataset generation with controlled view interactions. This allows one to empirically observe how the accuracy of various multi-view learning algorithms fluctuates with the levels of interaction (e.g., correlation, complementarity) between the views. Theoretical and empirical justifications are provided for each contribution.
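
    A hedged toy version of the idea (not MAGE's generator): build two views whose shared latent factor sets their correlation and whose private factors carry complementary label information. The redundancy parameter and construction are assumptions for illustration.

        # Hedged sketch: generating two views with a controlled interaction.
        import numpy as np

        def two_view_dataset(n=500, redundancy=0.5, rng=None):
            # redundancy = 1: both views carry the same signal (correlated);
            # redundancy = 0: each view only sees its private factor.
            rng = np.random.default_rng(rng)
            shared = rng.normal(size=n)
            p1, p2 = rng.normal(size=n), rng.normal(size=n)
            y = ((shared + p1 + p2) > 0).astype(int)
            v1 = np.column_stack([shared,
                                  redundancy * shared + (1 - redundancy) * p1])
            v2 = np.column_stack([shared,
                                  redundancy * shared + (1 - redundancy) * p2])
            return v1, v2, y

        v1, v2, y = two_view_dataset(redundancy=0.2, rng=0)
        print(np.corrcoef(v1[:, 1], v2[:, 1])[0, 1])  # low: complementary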

    Sample Boosting Algorithm (SamBA) -- An Interpretable Greedy Ensemble Classifier Based on Local Expertise for Fat Data

    Ensemble methods are a very diverse family of algorithms with a wide range of applications. One of the most commonly used is boosting, with AdaBoost as its most prominent representative. AdaBoost relies on greedily learning base classifiers that rectify the errors from previous iterations, and then combines them through a weighted majority vote based on their quality on the entire learning set. In this paper, we propose a supervised binary classification framework that propagates the local knowledge acquired during the boosting iterations to the prediction function. Based on this general framework, we introduce SamBA, an interpretable greedy ensemble method designed for fat datasets, which have a large number of dimensions and a small number of samples. SamBA learns local classifiers and combines them using a similarity function, exploiting local expertise to make efficient use of the data. We provide a theoretical analysis of SamBA, yielding convergence and generalization guarantees. In addition, we highlight SamBA's empirical behavior in an extensive experimental analysis on both real biological and generated datasets, comparing it to state-of-the-art ensemble methods and similarity-based approaches.
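
    The schematic sketch below conveys the local-expertise idea only; it is not SamBA itself. Each base classifier is fit on a neighborhood of the data and trusted more near that neighborhood, via an RBF similarity to its centroid; all class and parameter names are hypothetical.

        # Hedged sketch: a similarity-weighted local ensemble (not SamBA).
        import numpy as np
        from sklearn.tree import DecisionTreeClassifier

        class LocalVote:
            def __init__(self, n_estimators=10, gamma=1.0, seed=0):
                self.n_estimators, self.gamma = n_estimators, gamma
                self.rng = np.random.default_rng(seed)

            def fit(self, X, y):
                self.models, self.centers = [], []
                for _ in range(self.n_estimators):
                    a = X[self.rng.integers(len(y))]       # random anchor
                    idx = np.argsort(((X - a) ** 2).sum(axis=1))[:len(y) // 4]
                    self.models.append(DecisionTreeClassifier(max_depth=2)
                                       .fit(X[idx], y[idx]))
                    self.centers.append(X[idx].mean(axis=0))
                return self

            def predict(self, X):
                votes = np.zeros(len(X))
                for h, c in zip(self.models, self.centers):
                    sim = np.exp(-self.gamma * ((X - c) ** 2).sum(axis=1))
                    votes += sim * (2 * h.predict(X) - 1)  # signed local vote
                return (votes > 0).astype(int)

        Xd = np.random.default_rng(1).normal(size=(200, 2))
        yd = (Xd[:, 0] * Xd[:, 1] > 0).astype(int)   # locally varying rule
        print((LocalVote().fit(Xd, yd).predict(Xd) == yd).mean())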

    Pruning Random Forest with Orthogonal Matching Trees

    In this paper, we propose a new method to reduce the size of Breiman's Random Forests. Given a Random Forest and a target size, our algorithm builds a linear combination of trees that minimizes the training error. The selected trees, as well as the weights of the linear combination, are obtained by means of the Orthogonal Matching Pursuit algorithm. We test our method on many public benchmark datasets, both for regression and binary classification, and we compare it to other pruning techniques. Experiments show that our technique performs significantly better than, or as well as, the alternatives on many datasets. We also discuss the benefits and shortcomings of learning weights for the pruned forest, which leads us to propose a non-negativity constraint on the OMP weights for better empirical results.
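
    A minimal sketch of the described procedure, under illustrative choices of data and target size: the columns of the OMP dictionary are per-tree training predictions, and OMP picks a weighted subset of trees.

        # Hedged sketch: pruning a random forest with Orthogonal Matching Pursuit.
        import numpy as np
        from sklearn.datasets import make_regression
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.linear_model import OrthogonalMatchingPursuit

        X, y = make_regression(n_samples=400, n_features=20, random_state=0)
        forest = RandomForestRegressor(n_estimators=100,
                                       random_state=0).fit(X, y)

        # Dictionary: one column of training predictions per tree.
        P = np.column_stack([t.predict(X) for t in forest.estimators_])
        omp = OrthogonalMatchingPursuit(n_nonzero_coefs=10).fit(P, y)
        kept = np.flatnonzero(omp.coef_)          # indices of selected trees
        w = omp.coef_[kept]

        def pruned_predict(Xnew):
            Pnew = np.column_stack([forest.estimators_[i].predict(Xnew)
                                    for i in kept])
            return Pnew @ w + omp.intercept_

        print(len(kept), np.mean((pruned_predict(X) - y) ** 2))

    Note that scikit-learn's OMP has no non-negativity option, so the non-negative variant mentioned above would require refitting the weights of the selected trees with, e.g., scipy.optimize.nnls.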